Abstract: The importance of estimating sample sizes is rarely understood by researchers when planning a study. 
This paper aims to highlight the centrality of sample size estimations in health research. Examples that help in 
understanding the basic concepts involved in their calculation are presented. The scenarios covered are based 
more on the epidemiological reasoning and less on mathematical formulae. Proper calculation of the number of 
participants in a study diminishes the likelihood of errors, which are often associated with adverse consequences 
in terms of economic, ethical and health aspects. 
INTRODUCTION 

Investigations in the health field are guided by research problems or
questions, which should be clearly defined in the study project. Sample
size calculation is an essential item to be included in the project to
reduce the probability of error, respect ethical standards, define the
logistics of the study and, last but not least, improve its chances of
success when evaluated by funding agencies.

Let us imagine that a group of investigators decides to study the
frequency of sunscreen use and how the use of this product is
distributed in the "population". In order to carry out this task, the
authors define two research questions, each of which involves a
distinct sample size calculation: 1) What is the proportion of people
that use sunscreen in the population? and 2) Are there differences in
the use of sunscreen between men and women, or between individuals that
are white or of another skin color group, or between the wealthiest and
the poorest, or between people with more and fewer years of schooling?
Before doing the calculations, it will be necessary to review a few
fundamental concepts and identify the parameters required to perform
them.



WHAT DO WE MEAN, WHEN WE TALK ABOUT 
POPULATIONS? 

First of all, we must define what a population is. A population is a
group of individuals restricted to a geographical region (neighborhood,
city, state, country, continent etc.) or to certain institutions
(hospitals, schools, health centers etc.), that is, a set of individuals
that have at least one characteristic in common. The target population
corresponds to a portion of the previously mentioned population about
which one intends to draw conclusions, that is to say, it is the part of
the population whose characteristics are an object of interest of the
investigator. Finally, the study population is that which will actually
be part of the study, which will be evaluated and will allow conclusions
to be drawn about the target population, as long as it is representative
of the latter. Figure 1 demonstrates how these concepts are
interrelated.

We will now separately consider the parameters required for sample size
calculation in studies that aim to estimate the frequency of events
(prevalence of health outcomes or behaviors, for example) and in studies
that aim to test associations between risk/protective factors and
dichotomous health conditions (yes/no), as well as health outcomes
measured on numerical scales.
Figure 1: Graphic representation of the concepts of population, target population and study population



The formulas used for these calculations may be obtained from different
sources; we recommend using the free online software OpenEpi
(www.openepi.com).



WHICH PARAMETERS DOES SAMPLE SIZE CALCULATION
DEPEND UPON FOR A STUDY THAT AIMS AT ESTIMATING
THE FREQUENCY OF HEALTH OUTCOMES, BEHAVIORS OR
CONDITIONS?

When approaching the first research question defined at the beginning of
this article (What is the proportion of people that use sunscreen?), the
investigators need to conduct a prevalence study. In order to do this,
some parameters must be defined to calculate the sample size, as
described in chart 1.
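As a rough illustration of how the parameters in chart 1 combine, the sketch below implements one commonly used formula for estimating a proportion in a finite population, n = DEFF x N x p(1-p) / [(d^2/z^2)(N-1) + p(1-p)], where N is the population size, p the expected prevalence, d the sample error, z the normal quantile for the chosen confidence level and DEFF the design effect. That this is the formula behind the figures in this article is an assumption, and the function name and argument names are ours, chosen only for illustration.

```python
import math
from statistics import NormalDist


def sample_size_prevalence(population, prevalence, error, confidence=0.95, deff=1.0):
    """Sample size to estimate a proportion in a finite population.

    population: size of the target population (N)
    prevalence: expected prevalence as a fraction (0.10 for 10%)
    error:      acceptable sample error as a fraction (0.02 for 2 p.p.)
    confidence: confidence level (0.95 by convention)
    deff:       design effect (1.0 when no cluster sampling is used)
    """
    # Two-sided normal quantile for the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pq = prevalence * (1 - prevalence)
    n = deff * population * pq / ((error ** 2 / z ** 2) * (population - 1) + pq)
    # Round up, since a fraction of a participant cannot be recruited.
    return math.ceil(n)


# Sunscreen use at work (p = 10%) among 100 health center users.
print(sample_size_prevalence(100, 0.10, 0.02))  # 2 p.p. error -> 90
print(sample_size_prevalence(100, 0.10, 0.05))  # 5 p.p. error -> 59
```

Because p enters only through the product p(1-p), the function returns the same value for an expected prevalence of 10% and of 90%, and its largest value at 50%. Results from other software may differ by a unit or so, depending on how the normal quantile and the final value are rounded.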

Chart 1: Description of different parameters to be considered in the calculation of sample size for a study aiming at
estimating the frequency of health outcomes, behaviors or conditions

Parameter: Population size
Description: Total population size from which the sample will be drawn
and about which researchers will draw conclusions (target population).
Remark: Information regarding population size may be obtained based on
secondary data from hospitals, health centers, census surveys
(population, schools etc.). The smaller the target population (for
example, less than 100 individuals), the larger the sample size will
proportionally be.

Parameter: Expected prevalence of outcome or event of interest
Description: The study outcome must be a percentage, that is, a number
that varies from 0% to 100%.
Remark: Information regarding expected prevalence rates should be
obtained from the literature or by carrying out a pilot study. When this
information is not available in the literature and a pilot study cannot
be carried out, the value that maximizes sample size is used (50% for a
fixed value of sample error).

Parameter: Sample error for estimate
Description: The value we are willing to accept as error in the estimate
obtained by the study.
Remark: The smaller the sample error, the larger the sample size and the
greater the precision. In health studies, values between two and five
percentage points are usually recommended.

Parameter: Significance level
Description: It is the probability that the expected prevalence will be
within the error margin being established.
Remark: The higher the confidence level (greater expected precision),
the larger will be the sample size. This parameter is usually fixed as
95%.

Parameter: Design effect
Description: It is necessary when the study participants are chosen by
cluster selection procedures. This means that, instead of the
participants being individually selected (simple, systematic or
stratified sampling), they are first divided and randomly selected in
groups (census tracts, neighborhoods, households, days of the week etc.)
and later the individuals are selected within these groups. Thus,
greater similarity is expected among the respondents within a group than
in the general population. This generates loss of precision, which needs
to be compensated by a sample size adjustment (increase).
Remark: The principle is that the total estimated variance may have been
reduced as a consequence of cluster selection. The value of the design
effect may be obtained from the literature. When not available, a value
between 1.5 and 2.0 may be adopted, and the investigators should
evaluate, after the study is completed, the actual design effect and
report it in their publications. The greater the homogeneity within each
group (the more similar the respondents are within each cluster), the
greater the design effect will be and the larger the sample size
required to increase precision. In studies that do not use cluster
selection procedures (simple, systematic or stratified sampling), the
design effect is considered as null or 1.0.

Chart 2 presents some sample size simulations, according to the outcome
prevalence, sample error and the type of target population investigated.
The same basic question was used in this chart (prevalence of sunscreen
use), but considering three different situations (at work, while doing
sports or at the beach), as in the study by Duquia et al. conducted in
the city of Pelotas, state of Rio Grande do Sul, in 2005.3

Chart 2: Sample size calculation to estimate the frequency (prevalence) of sunscreen use in the population, considering
different scenarios but keeping the significance level (95%) and the design effect (1.0) constant

                                Sunscreen use at    Sunscreen use in    Sunscreen use at
Target population               work (p=10%)        sports (p=35%)      the beach (p=50%)
                                Acceptable error    Acceptable error    Acceptable error
                                2 p.p.    5 p.p.    2 p.p.    5 p.p.    2 p.p.    5 p.p.

Health center users
investigated in a single day
(population size = 100)         90        59        96        78        97        80

All users in the area covered
by a health center
(population size = 1,000)       464       122       687       260       707       278

All users from the areas
covered by all health centers
in a city
(population size = 10,000)      796       137       1794      338       1937      370

The entire city population
(population size = 40,000)      847       138       2072      347       2265      381

p = prevalence of outcome; p.p. = percentage points. Values in the chart are sample sizes.

The calculations show that, holding the sample error and the
significance level constant, the higher the expected prevalence, the
larger the required sample size. However, when the expected prevalence
surpasses 50%, the required sample size progressively diminishes: the
sample size for an expected prevalence of 10% is the same as that for an
expected prevalence of 90%.

The investigator should also define beforehand the precision level to be
accepted for the investigated event (sample error) and the confidence
level of this result (usually 95%). Chart 2 demonstrates that, holding
the expected prevalence constant, the higher the precision (smaller
sample error) and the higher the confidence level (in this case, 95% was
considered for all calculations), the larger the required sample size.

Chart 2 also demonstrates that there is a direct relationship between
the target population size and the number of individuals to be included
in the sample. Nevertheless, when the target population size is
sufficiently large, that is, surpasses an arbitrary value (for example,
one million individuals), the resulting sample size tends to stabilize.
The smaller the target population, the larger the sample will be in
proportion to it; in some cases, the sample may even correspond to the
total number of individuals from the target population. In these cases,
it may be more convenient to study the entire target population,
carrying out a census survey, rather than a study based on a sample of
the population.

SAMPLE CALCULATION TO TEST THE ASSOCIATION
BETWEEN TWO VARIABLES: HYPOTHESES AND TYPES
OF ERROR

When the study objective is to investigate whether there are differences
in sunscreen use according to sociodemographic characteristics (such as,
for example, between men and women), what is under consideration is the
existence of an association between explanatory variables (exposure or
independent variables, in this case sociodemographic variables) and a
dependent or outcome variable (use of sunscreen).
In these cases, we need first to understand what the hypotheses are, as
well as the types of error that may result from their acceptance or
refutation. A hypothesis is a "supposition arrived at from observation
or reflection, that leads to refutable predictions". In other words, it
is a statement that may be questioned or tested and that may be
falsified in scientific studies.

In scientific studies, there are two types of hypothesis: the null
hypothesis (H0), or original supposition that we assume to be true for a
given situation, and the alternative hypothesis (HA), or additional
explanation for the same situation, which we believe may replace the
original supposition. In the health field, H0 is frequently defined as
the equality or absence of difference in the outcome of interest between
the studied groups (for example, sunscreen use is equal in men and
women). On the other hand, HA assumes the existence of a difference
between groups. HA is called two-tailed when it is expected that the
difference between the groups may occur in either direction (men using
more sunscreen than women or vice versa). However, if the investigators
expect to find that a specific group uses more sunscreen than the other,
they will be testing a one-tailed HA.

In the sample investigated by Duquia et al., the frequency of sunscreen
use at the beach was greater in men (32.7%) than in women (26.2%).3
Although this is what was observed in the sample, that is, men do wear
more sunscreen than women, the investigators must decide whether they
refute or accept H0 in the target population (which contends that there
is no difference in sunscreen use according to sex). Given that the
entire target population is hardly ever investigated to confirm or
refute the difference observed in the sample, the authors have to be
aware that, regardless of their decision (accepting or refuting H0),
their conclusion may be wrong, as can be seen in figure 2.

If the investigators conclude that sunscreen use differs between men and
women in the target population as it does in the sample (rejecting H0),
they may be making a type I or Alpha error, which is the probability of
rejecting H0 based on sample results when, in the target population, H0
is true (the difference between men and women regarding sunscreen use
found in the sample is not observed in the target population). If the
investigators conclude that there are no differences between the groups
(accepting H0), they may be making a type II or Beta error, which is the
probability of accepting H0 when, in the target population, H0 is false
(that is, HA is true) or, in other words, the probability of stating
that the frequency of sunscreen use is equal between the sexes when it
is different in the same groups of the target population.



                                    Result in the target population

                                    There is no difference      There is difference
                                    in sunscreen use            in sunscreen use
Results in the sample               between the sexes           between the sexes
                                    (H0 true)                   (HA true)

There is no difference in
sunscreen use between the sexes
(Accepted H0)                       CORRECT                     Error type II (Beta)

There is difference in
sunscreen use between the sexes
(Rejected H0)                       Error type I (Alpha)        CORRECT

Figure 2: Types of possible results when performing a hypothesis test

In order to accept or refute H0, the investigators need to define
beforehand the maximum probability of type I and type II errors that
they are willing to incorporate into their results. In general, the type
I error is fixed at a maximum value of 5% (0.05, or a confidence level
of 95%), since the consequences of this type of error are considered
more harmful. For example, to state that an exposure/intervention
affects a health condition when this does not happen in the target
population may bring about behaviors or actions (therapeutic changes,
implementation of intervention programs etc.) with adverse consequences
in ethical, economic and health terms. In the study conducted by Duquia
et al., when the authors contend that the use of sunscreen was different
according to sex, the p value presented (<0.001) indicates that, if
there were no such difference in the target population, the probability
of observing a difference at least as large as the one found in the
sample would be less than 0.1% (confidence level >99.9%).3

Although the type II or Beta error is considered less harmful, it should
also be avoided: if a study contends that a given exposure/intervention
does not affect the outcome when this effect actually exists in the
target population, the consequence may be that a new medication with
better therapeutic effects is not administered or that some aspects
related to the etiology of the condition are not considered. This is the
reason why the type II error is usually fixed at a maximum value of 20%
(or 0.20). In publications, this value tends to be reported as the power
of the study, which is the ability of the test to detect a difference
when in fact it exists in the target population (usually fixed at 80%,
as a result of the 1-Beta calculation).
SAMPLE CALCULATION FOR STUDIES THAT
AIM AT TESTING THE ASSOCIATION BETWEEN
A RISK/PROTECTIVE FACTOR AND AN OUTCOME,
EVALUATED DICHOTOMOUSLY

In cases where the exposure variables are dichotomous
(intervention/control, man/woman, rich/poor etc.) and so is the outcome
(negative/positive outcome, to use sunscreen or not), the parameters
required to calculate sample size are those described in chart 3.
Following the previously mentioned example, it would be interesting to
know whether sex, skin color, schooling level and income are associated
with the use of sunscreen at work, while doing sports and at the beach.
Thus, when the four exposure variables are crossed with the three
outcomes, there are 12 different questions to be answered and
consequently an equal number of sample size calculations to be
performed. Using the information in the article by Duquia et al.3 for
the prevalence of exposures and outcomes, a simulation of sample size
calculations was performed for each one of these situations (Chart 4).

Estimates show that studies with more power, or that intend to find a
difference of lower magnitude in the frequency of the outcome (in this
case, the prevalence rates) between exposed and non-exposed groups,
require larger sample sizes. For these reasons, in sample size
calculations, an effect measure between 1.5 and 2.0 (for risk factors)
or between 0.50 and 0.75 (for protective factors) and a power of 80% are
frequently used.

Considering the values in each column of chart 4, we may also conclude
that, when the non-exposed/exposed relationship moves away from one (a
value of one meaning similar proportions of exposed and non-exposed
individuals in the sample), the sample size increases. For this reason,
intervention studies usually work with the same proportion of
individuals in the intervention and control groups. Upon analysis of the
values on each line, it can be concluded that there is an inverse
relationship between the prevalence of the outcome and the required
sample size.

Based on these estimates, assuming that the authors intended to test all
of these associations, it would be necessary to choose the largest
estimated sample size (2,630 subjects, for a power of 80%). In case the
required sample size is larger than the target population, the
investigators may decide to perform a multicenter study, lengthen the
period for data collection, modify the research question or face the
possibility of not having sufficient power to draw valid conclusions.
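To make the role of each parameter more concrete, the sketch below computes the sample size needed to compare two proportions with the Fleiss formula without continuity correction, one of several formulas available for this purpose. Whether this is the formula used for the simulations in this article is an assumption, so the totals it returns should be close to those in chart 4 without necessarily matching them, since rounding conventions and the formula chosen both affect the result. The function name and arguments are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p_non_exposed, pr, ratio, alpha=0.05, power=0.80):
    """Total sample size to compare two proportions with a two-tailed test.

    p_non_exposed: outcome prevalence in the non-exposed group (fraction)
    pr:            expected prevalence ratio (exposed / non-exposed)
    ratio:         non-exposed individuals per exposed individual (r = NE/E)
    alpha:         type I error
    power:         1 - Beta
    """
    p_exposed = p_non_exposed * pr
    if not 0 < p_exposed < 1:
        # Mirrors the "ND" cells: the implied prevalence would exceed 100%.
        raise ValueError("prevalence in the exposed group must lie between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Pooled prevalence, weighting each group by its share of the sample.
    p_bar = (p_exposed + ratio * p_non_exposed) / (1 + ratio)
    term_null = z_alpha * math.sqrt((ratio + 1) * p_bar * (1 - p_bar))
    term_alt = z_beta * math.sqrt(
        ratio * p_exposed * (1 - p_exposed) + p_non_exposed * (1 - p_non_exposed)
    )
    n_exposed = (term_null + term_alt) ** 2 / (ratio * (p_exposed - p_non_exposed) ** 2)
    n_non_exposed = ratio * n_exposed
    return math.ceil(n_exposed) + math.ceil(n_non_exposed)


# Sex and sunscreen use at work: PONE = 10.7%, PR = 1.50, r = 0.79.
print(sample_size_two_proportions(0.107, 1.5, 0.79, power=0.80))
print(sample_size_two_proportions(0.107, 1.5, 0.79, power=0.90))
```

The first call should return a total of roughly 1,300 participants, and raising the power to 90% or bringing the prevalence ratio closer to one increases the result, in line with the conclusions drawn from chart 4.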

Additional aspects need to be considered in the previous estimates to
arrive at the final sample size, which may include the possibility of
refusals and/or losses in the study (an additional 10-15%), the need for
adjustments for confounding factors (an additional 10-20%, applicable to
observational studies), the possibility of effect modification (which
implies an analysis of subgroups and the need to double or triple the
sample size), as well as the existence of design effects (multiplication
of sample size by 1.5 to 2.0) in case of cluster sampling.
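The adjustments listed above can be applied to a base estimate as in the following sketch. The text does not specify whether the percentages are added together or applied one after the other; applying them multiplicatively, as done here, is our assumption, and the function name and default values are illustrative.

```python
import math


def adjust_sample_size(n, losses=0.10, confounding=0.15, deff=1.0):
    """Inflate a base sample size for losses, confounding and design effect.

    n:           base sample size from the formulas above
    losses:      expected refusals and/or losses (0.10 to 0.15)
    confounding: allowance for confounding adjustment (0.10 to 0.20)
    deff:        design effect (1.5 to 2.0 under cluster sampling, else 1.0)
    """
    # Assumption: each allowance multiplies the running total.
    adjusted = n * deff * (1 + losses) * (1 + confounding)
    return math.ceil(adjusted)


# Largest estimate mentioned in the text (2,630), with cluster sampling.
print(adjust_sample_size(2630, losses=0.10, confounding=0.15, deff=1.5))  # 4991
```

An allowance for effect modification is not included; following the text, it would mean doubling or tripling the value passed as `n`.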

SAMPLE CALCULATIONS FOR STUDIES THAT 
AIM AT TESTING THE ASSOCIATION 
BETWEEN A DICHOTOMOUS EXPOSURE AND 
A NUMERICAL OUTCOME 

Suppose that the investigators intend to evaluate whether the daily
quantity of sunscreen used (in grams), the time of daily exposure to
sunlight (in minutes) or a laboratory parameter (such as vitamin D
levels) differ according to the sociodemographic variables mentioned. In
all of these cases, the outcomes are numerical variables (discrete or
continuous), and the objective is to answer whether the mean outcome in
the exposed/intervention group is different from that in the
non-exposed/control group.

In this case, the first three parameters from chart 3 (alpha error,
power of the study and relationship between non-exposed/exposed groups)
are required, and the conclusions about their influence on the final
sample size also apply. In addition to defining the expected outcome
means in each group or the expected mean difference between
non-exposed/exposed groups (usually at least 15% of the mean value in
the non-exposed group), the investigators also need to define the
standard deviation value for each group. There is a direct relationship
between the standard deviation value and the sample size, which is why,
in the case of asymmetric variables, the sample size would be
overestimated. In such cases, the option may be to estimate sample sizes
based on specific calculations for asymmetric variables, or the
investigators may choose to use a percentage of the median value (for
example, 25%) as a substitute for the standard deviation.
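For a numerical outcome, a minimal sketch based on the usual normal approximation for comparing two means is given below: the size of the exposed group is (sd_E^2 + sd_NE^2 / r) x (z_alpha + z_beta)^2 / difference^2, and the non-exposed group is r times as large. The function name is ours and the numbers in the example are hypothetical, chosen only to show the mechanics.

```python
import math
from statistics import NormalDist


def sample_size_two_means(mean_diff, sd_exposed, sd_non_exposed,
                          ratio=1.0, alpha=0.05, power=0.80):
    """Group sizes needed to detect a difference between two means.

    mean_diff: expected difference between the group means
    sd_*:      expected standard deviation in each group
    ratio:     non-exposed individuals per exposed individual (r = NE/E)
    Returns (n_exposed, n_non_exposed).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed test
    z_beta = NormalDist().inv_cdf(power)
    variance = sd_exposed ** 2 + sd_non_exposed ** 2 / ratio
    n_exposed = variance * (z_alpha + z_beta) ** 2 / mean_diff ** 2
    return math.ceil(n_exposed), math.ceil(ratio * n_exposed)


# Hypothetical scenario: mean daily sun exposure of 60 minutes in the
# non-exposed group, a difference of 15% of that mean (9 minutes) and a
# standard deviation of 30 minutes in both groups.
print(sample_size_two_means(9, 30, 30))  # (175, 175)
```

Doubling the standard deviation in this sketch roughly quadruples the required sample, which is the direct relationship described above.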

SAMPLE SIZE CALCULATIONS FOR OTHER 
TYPES OF STUDY 

There are also specific calculations for some other quantitative
studies, such as those aiming to assess correlations (exposure and
outcome are numerical variables), time until the event (death, cure,
relapse etc.) or the validity of diagnostic tests, but they are not
described in this article, given that they were discussed elsewhere.
Chart 3: Description of different parameters to be considered in the calculation of sample size for a study aiming at
testing the association between a risk/protective factor and a dichotomous outcome

Parameter: Type I or Alpha error
Description: It is the probability of rejecting H0 when H0 is true in
the target population. Usually fixed as 5%.
Remark: It is expressed by the p value. It is usually 5% (p<0.05). For
sample size calculation, the confidence level may be adopted (usually
95%), calculated as 1-Alpha. The smaller the Alpha error (greater
confidence level), the larger will be the sample size.

Parameter: Statistical power (1-Beta)
Description: It is the ability of the test to detect a difference in the
sample, when it exists in the target population.
Remark: Calculated as 1-Beta. The greater the power, the larger the
required sample size will be. A value between 80% and 90% is usually
used.

Parameter: Relationship between non-exposed/exposed groups in the sample
Description: It indicates the existing relationship between non-exposed
and exposed groups in the sample.
Remark: For observational studies, the data are usually obtained from
the scientific literature. In intervention studies, the value 1:1 is
frequently adopted, indicating that half of the individuals will receive
the intervention and the other half will be the control or comparison
group. Some intervention studies may use a larger number of controls
than of individuals receiving the intervention. The more distant this
ratio is from one, the larger will be the required sample size.

Parameter: Prevalence* of outcome in the non-exposed group** (percentage
of positive among the non-exposed)
Description: Proportion of individuals with the disease (outcome) among
those non-exposed to the risk factor (or that are part of the control
group).
Remark: Data usually obtained from the literature. When this information
is not available but there is information on general
prevalence/incidence in the population, this value may be used in sample
size calculation (values attributed to the control group in intervention
studies) or estimated based on the following formula:

PONE = pO / (pNE + (pE * PR))

where pO = prevalence of outcome; pNE = percentage of non-exposed; pE =
percentage of exposed; PR = prevalence* ratio (usually a value between
1.5 and 2.0).

Parameter: Expected prevalence* ratio
Description: Relationship between the prevalence* of disease in the
exposed (intervention) group and the prevalence* of disease in the
non-exposed group, indicating how many times it is expected that the
prevalence* will be higher (or lower) in the exposed compared to the
non-exposed group. Usually, a value between 1.50 and 2.00 is used
(exposure as risk factor) or between 0.50 and 0.75 (protective factor).
Remark: It is the value that the investigators intend to find as HA,
with the corresponding H0 equal to one (similar prevalence* of the
outcome in both exposed and non-exposed groups). For the sample size
estimates, the expected outcome prevalence* may be used for the
non-exposed group, or the expected difference in the prevalence* between
the exposed and the non-exposed groups. For intervention studies, the
clinical relevance of this value should be considered. The smaller the
prevalence ratio (the smaller the expected difference between the
groups), the larger the required sample size.

Parameter: Type of statistical test
Description: The test may be one-tailed or two-tailed, depending on the
type of the HA.
Remark: Two-tailed tests require larger sample sizes.

* It may be prevalence, incidence or risk, according to type of study.
** Non-exposed or control group.
H0 = null hypothesis; HA = alternative hypothesis.
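The formula for the prevalence of the outcome in the non-exposed group (PONE) can be checked with a few lines of code. In this sketch the shares of exposed and non-exposed individuals are entered as fractions that add up to one, which is how we read the formula; the overall prevalence of the outcome is kept as a percentage, so the result is also a percentage. The function name is illustrative.

```python
def prevalence_in_non_exposed(p_outcome, p_non_exposed, p_exposed, pr):
    """PONE = pO / (pNE + pE * PR).

    p_outcome:     overall prevalence of the outcome, in percent
    p_non_exposed: share of non-exposed individuals, as a fraction
    p_exposed:     share of exposed individuals, as a fraction
    pr:            expected prevalence ratio
    """
    return p_outcome / (p_non_exposed + p_exposed * pr)


# Sunscreen use at work (13.7%) by sex: 44% non-exposed, 56% exposed, PR 1.50.
print(round(prevalence_in_non_exposed(13.7, 0.44, 0.56, 1.5), 1))  # 10.7
```

The same call with the other exposure distributions and outcome prevalences gives the remaining PONE values used in the simulations.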
Chart 4: Sample size calculation to test the association between sociodemographic variables and sunscreen use, considering
different scenarios but keeping the significance level (95%) and the design effect (1.0) constant

Exposure             Power   Sunscreen use at work       Sunscreen use in sports     Sunscreen use at the beach
                             p=13.7%                     p=30.2%                     p=60.8%
                             PONE    PR 1.50   PR 2.00   PONE    PR 1.50   PR 2.00   PONE    PR 1.50   PR 2.00

Sex                  80%     10.7%   1298      388       23.6%   487       134       47.5%   136       28
Female: 56% (E)      90%             1738      519               652       179               181       38
Male: 44% (NE)
r: 0.79

Skin color           80%     9.7%    2630      822       21.4%   970       276       43.1%   275       49
White: 82% (E)       90%             3520      1100              1299      370               368       66
Other: 18% (NE)
r: 0.22

Schooling            80%     12.2%   1340      366       26.8%   488       131       54.0%   138       ND
0-4 years: 25% (E)   90%             1795      490               654       175               184       ND
>4 years: 75% (NE)
r: 3.00

Per capita income    80%     11.0%   1228      360       24.2%   458       124       48.6%   128       28
<133: 50% (E)        90%             1644      480               612       166               170       36
>133: 50% (NE)
r: 1.00

E = exposed group; NE = non-exposed group; r = NE/E relationship; PONE = prevalence of outcome in the non-exposed group
(percentage of positives in the non-exposed group), estimated based on the formula from chart 3, considering a PR of 1.50;
PR = expected prevalence ratio, incidence ratio or relative risk; values under each expected PR are the minimum necessary
sample size (n); ND = value could not be determined, as the prevalence of the outcome in the exposed would be above 100%,
according to the specified parameters.



CONCLUSION 

Sample size calculation is always an essential step during the planning
of scientific studies. An insufficient or small sample size may not be
able to demonstrate the desired difference, or to estimate the frequency
of the event of interest with acceptable precision. A very large sample
may add to the complexity of the study and its associated costs,
rendering it unfeasible. Both situations are ethically unacceptable and
should be avoided by the investigator.